Imbalanced image classification approach based on convolution neural network and cost-sensitivity
TAN Jiefan, ZHU Yan, CHEN Tung-shou, CHANG Chin-chen
Journal of Computer Applications    2018, 38 (7): 1862-1865.   DOI: 10.11772/j.issn.1001-9081.2018010152
To address low minority-class recall, high classification cost, and the expense of manual feature selection in imbalanced image classification, an approach based on a Triplet-sampling Convolutional Neural Network (Triplet-sampling CNN) and a Cost-Sensitive Support Vector Machine (CSSVM), called Triplet-CSSVM, was proposed. The method has two parts: feature learning and cost-sensitive classification. First, a CNN trained with Triplet loss as its loss function learned an end-to-end encoding that maps images to a Euclidean space. Then, the dataset was rebalanced by a sampling method to even out the class distribution. Finally, the CSSVM classification algorithm, which assigns different cost factors to different classes, produced the classification result with the minimum cost. Experiments were conducted on the portrait dataset FaceScrub using the deep learning framework Caffe. The experimental results show that, at an imbalance ratio of 1:3, the proposed method improves precision by 31 percentage points and recall by 71 percentage points compared with VGGNet-SVM (Visual Geometry Group Net-Support Vector Machine).
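The two ingredients named in the abstract, a triplet loss for feature learning and per-class cost factors for classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the squared Euclidean distance, the margin of 0.2, the hinge form of the cost-sensitive objective, and the function names are all assumptions chosen for illustration, since the abstract does not specify them.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss for one (anchor, positive, negative) embedding triple.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed value, not from the paper).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def cost_sensitive_hinge(scores, labels, cost_pos, cost_neg):
    """Hinge loss with a separate cost factor for each class.

    `labels` are +1 (minority class) or -1 (majority class) and `scores`
    are the classifier's decision values. `cost_pos` and `cost_neg` are
    hypothetical cost factors; a larger `cost_pos` penalizes missed
    minority samples more heavily.
    """
    total = 0.0
    for score, label in zip(scores, labels):
        cost = cost_pos if label == 1 else cost_neg
        total += cost * max(0.0, 1.0 - label * score)
    return total
```

With a 1:3 imbalance, one plausible choice is to set `cost_pos` to three times `cost_neg`, so that both classes contribute comparably to the objective; the abstract does not say how the cost factors were actually chosen.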
Optimum feature selection based on genetic algorithm under Web spam detection
WANG Jiaqing, ZHU Yan, CHEN Tung-shou, CHANG Chin-chen
Journal of Computer Applications    2018, 38 (1): 295-299.   DOI: 10.11772/j.issn.1001-9081.2017061560
To address the high dimensionality and redundancy of the features used in Web spam detection, an Improved Feature Selection method Based on Information Gain and Genetic Algorithm (IFS-BIGGA) was proposed. First, features were ranked by Information Gain (IG), and a dynamic threshold was set to discard redundant features. Second, the chromosome encoding function was modified and the selection operator was improved in the Genetic Algorithm (GA). The Area Under the receiver operating characteristic Curve (AUC) of a Random Forest (RF) classifier was then used as the fitness function to select highly discriminative features. Finally, the Optimal Minimum Feature Set (OMFS) was obtained by increasing the number of experimental iterations to reduce the effect of the algorithm's randomness. The experimental results show that, compared with the high-dimensional feature set, OMFS decreases the AUC under RF by 2% but increases the True Positive Rate (TPR) by 21%, reduces the feature dimension by 92%, and decreases the average detection time by 83%. Moreover, compared with the Traditional GA (TGA) and the Imperialist Competitive Algorithm (ICA), the F1 score under Bayes Net (BN) is increased by 4.2% and 3.5% respectively. These results indicate that IFS-BIGGA can effectively reduce the feature dimension, which lowers the computation cost and improves detection efficiency in practical Web spam detection projects.
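The first stage described above, ranking features by Information Gain, can be sketched as follows. This is a generic illustration for discrete feature values, not the IFS-BIGGA code: the function names and the dictionary-of-columns input format are assumptions, and the paper's dynamic threshold and GA stages are not reproduced because the abstract does not define them.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature) for one discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (feature name, IG) pairs sorted from most to least informative.

    `columns` maps a feature name to its list of values, one per sample.
    """
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)
```

A threshold on the resulting IG scores would then drop low-ranked features before the GA search; continuous features would need to be discretized first, which is a step this sketch leaves out.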